Query Selectivity Estimation Based on Improved V-optimal Histogram by Introducing Information about Distribution of Boundaries of Range Query Conditions
Author
Abstract
Selectivity is a parameter used by a query optimizer for early estimation of the size of the data that satisfies a query condition. It is calculated using an estimator of the distribution of values of the attribute involved in the processed query condition. A histogram built on attribute values from a database may serve as such a representation of the distribution. The paper introduces a new query-distribution-aware V-optimal histogram (qda-V-optimal histogram), which is useful in selectivity estimation for range queries. It takes into account both the 1-D distribution of attribute values and the 2-D distribution of boundaries of already processed queries. The advantages of the qda-V-optimal histogram appear when it is applied to selectivity estimation for range query conditions that form so-called hot regions. To obtain the proposed error-optimal histogram, we use dynamic programming together with Fuzzy C-Means clustering of the set of range boundaries.
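The dynamic programming step mentioned in the abstract can be illustrated with the classic V-optimal histogram, which places bucket boundaries so that the total within-bucket variance of frequencies is minimal. The sketch below is not the paper's qda-V-optimal method: it ignores the distribution of query boundaries and the Fuzzy C-Means step, and it assumes a small discrete domain where `freqs[i]` is the row count of the i-th distinct value. The function names `v_optimal_buckets` and `range_selectivity` are hypothetical.

```python
from itertools import accumulate


def v_optimal_buckets(freqs, num_buckets):
    """Split freqs into contiguous buckets minimizing total within-bucket SSE."""
    n = len(freqs)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(freqs)")
    # Prefix sums of f and f^2 give the SSE of any span in O(1).
    s1 = [0.0] + list(accumulate(float(f) for f in freqs))
    s2 = [0.0] + list(accumulate(float(f) ** 2 for f in freqs))

    def sse(i, j):
        # Sum of squared deviations from the mean over freqs[i:j], with j > i.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: minimal error when freqs[:j] is covered by exactly k buckets.
    cost = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    cut = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    cost[0][0] = 0.0
    for k in range(1, num_buckets + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + sse(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    cut[k][j] = i
    # Walk back through the stored cut points to recover the bucket spans.
    bounds, j = [], n
    for k in range(num_buckets, 0, -1):
        i = cut[k][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return bounds, cost[num_buckets][n]


def range_selectivity(freqs, bounds, lo, hi):
    """Estimate the fraction of rows whose value index lies in [lo, hi).

    Assumes frequencies are uniform inside each bucket (illustrative only).
    """
    total = sum(freqs)
    est = 0.0
    for start, end in bounds:
        overlap = max(0, min(hi, end) - max(lo, start))
        est += sum(freqs[start:end]) * overlap / (end - start)
    return est / total
```

A call such as `v_optimal_buckets([1, 1, 1, 10, 10, 10, 2, 2], 3)` returns the bucket spans as half-open index ranges together with the minimal error, and `range_selectivity` then estimates the selectivity of a range condition from those buckets. The triple loop costs O(B·n²) for B buckets over n values, which is acceptable for a sketch but may need pruning for large domains.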
Similar resources
Query-Condition-Aware Histograms in Selectivity Estimation Method
The paper shows an adaptive approach to the query selectivity estimation problem for queries with a range selection condition based on continuous attributes. The selectivity factor estimates the size of the data satisfying a query condition. This estimate is calculated at the initial stage of query processing in order to choose the optimal query execution plan. A non-parametric estimator of probabili...
Optimal Histograms for Hierarchical Range Queries (Extended Abstract)
Now there is tremendous interest in data warehousing and OLAP applications. OLAP applications typically view data as having multiple logical dimensions (e.g., product, location) with natural hierarchies defined on each dimension, and analyze the behavior of various measure attributes (e.g., sales, volume) in terms of the dimensions. OLAP queries typically involve hierarchical selections on some ...
Integrating Query-Feedback Based Statistics into Informix Dynamic Server
Statistics that accurately describe the distribution of data values in the columns of relational tables are essential for effective query optimization in a database management system. Manually maintaining such statistics in the face of changing data is difficult and can lead to suboptimal query performance and high administration costs. In this paper, we describe a method and prototype implemen...
Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information
With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...
Summarizing Spatial Relations - A Hybrid Histogram
Summarizing topological relations is fundamental to many spatial applications including spatial query optimization. In this paper, we examine the selectivity estimation for range window query to summarize the four important topological relations: contains, contained, overlap, and disjoint. We propose a novel hybrid histogram method which uses the concept of Min-skew partition in conjunction with...
Journal title:
Volume, issue
Pages -
Publication date: 2014